Avoid the pitfalls of Unix/NT
client-server applications interoperability
By Daniel Schwartz and Larry Duckworth
Including Windows NT in your client-server or distributed computing future, often integrated with one or more Solaris servers, is probably inevitable. The question is "when?" not "if?".
When an application is "partitioned" to operate on several servers, or has high transaction-throughput requirements (as when "rightsizing" mission-critical applications), several Unix/NT interoperability and portability challenges arise. Release delays, extra development costs, porting problems, interprocess communications (IPC) interoperability complexities and system-level programming burdens are just a few of the after-effects of choosing the wrong Unix-NT application approach.
The solution is to use a "standardized IPC" approach that avoids these problems and improves on the usual results. Companies like GTech and Fidelity Investments have harnessed the power of NT for mission-critical client-server processing without the interoperability penalties.
The many benefits of NT's features, plus the de facto standard nature of Microsoft's platforms, will combine to promote NT's use. Because Unix has been the recent platform of choice for "open" applications, the use of NT will often involve interrelationships with Unix. This is especially true if the application is more than a simple client to a single server for decision support or some two-tier processing (e.g. a GUI "fat client" to Sybase or Oracle on a server).
To better understand the potential pitfalls of developing an application that interoperates between Unix and NT, one need only review the well-known limitations inherent in using Named Pipes, sockets and remote procedure calls (RPCs). When developing a Unix and NT application, these limits (versus the stronger standardized IPC approach) are as follows:
Using Named Pipes:
- has limited platform coverage, and needs a LAN Manager gateway at the Unix end;
- has inconsistent per-thread and asynchronous-operations handling between the two platforms, requiring redundant operating-system-specific programming;
- has major source-code portability limits, reflecting that the Win32 application programming interface (API) is on NT only;
- provides point-to-point only, with no event multicast capability, limiting the flexibility and scalability of n:m transactions and messaging applications;
- offers no message selection via prioritization -- only FIFO queuing;
- allows no use of shared memory across platforms;
- has no flow control mechanisms for dynamically reacting to changing traffic patterns and traffic surges;
- has no application-level data translation;
- provides no monitoring tools to inquire into message queues, shared memory and signaling.
Using sockets:
- requires very low-level programming -- with related expertise and costs;
- gives inconsistent sockets implementations -- WinSock (NT) versus BSD sockets (Unix);
- restricts you to byte flows of data only -- very slow for high-transaction needs;
- provides point-to-point only, with no event multicast capability, limiting the flexibility and scalability of n:m transaction and messaging applications;
- allows no message queuing options -- only FIFO and no priorities setting;
- allows no use of shared memory across platforms;
- provides no flow-control mechanisms for dynamically reacting to changing traffic patterns and traffic surges;
- provides no application-level data translation;
- provides no monitoring tools to inquire into message queues, shared memory and signaling.
Using RPCs:
- provides NT-only interfaces to DCE's RPC;
- provides no support for other DCE services (e.g. Cell Directory Services);
- gives you no portability at the API level;
- provides no asynchronous messaging;
- may give you scalability problems beyond simple client-server use, as such problems are known to exist in RPCs;
- limits message queuing to FIFO, with no message priority setting;
- allows no use of shared memory across platforms;
- provides no monitoring tools to inquire into message queues, shared memory, signaling.
Finally, the intranodal IPC methods of Unix and Windows NT are not portable and are functionally incompatible. Thus, porting intranodal process-to-process components between Unix and NT requires a duplication of programming costs and time requirements.
Additional limits of client-server DBMS approach
Often, some thought is given to developing distributed client-server applications with a database-centric approach. However, several application portability and performance limitations exist versus using a standardized IPC approach. Those database-centric limitations include:
Portability:
- Applications built on specific platforms such as Windows 3.1 must be redeveloped for other clients, like the Macintosh.
- Hardware dependencies, such as integer size and byte ordering, complicate and impede the portability of an application.
- Software layers including operating systems (e.g. Windows and System 7) and development tools (e.g. Visual Basic, Powerbuilder and FoxPro) create dependencies that exacerbate the portability dilemma.
- Databases themselves are not readily portable or easily integrated; SQL is, in fact, limited and varies between vendors.
Scalability:
- Two-phase commits do not support the high transaction rates that mission-critical applications require.
- Implementing business process logic on the desktop imposes the need for powerful and costly workstations as well as many "back-end" servers. Administration of this "mushroom" will become unwieldy on an enterprisewide scale. According to Forrester, relatively few client-server applications have been placed into "hard-core" production.
- Support organizations armed with totally reliable and constantly available remote administration capabilities are necessary in order to maintain the system's integrity.
Flexibility:
- Each database vendor has methods to promote a degree of proprietary dependence.
- User interfaces, such as those for voice-processing systems and fax servers, have platform-specific requirements. Incorporating these and other devices in your environment will be a near-impossible chore in a purely client-server setting.
Obviously, several important limits exist when using "normal" development methods to build applications that are interoperable (and portable) between Solaris and NT. Using standardized IPC avoids the interoperability and portability pitfalls and harnesses the best of the two operating systems.
Unix/NT interoperability challenges defined
The native IPC mechanisms provided by the Unix and NT operating systems are semantically different from one another, incompatible with one another, and generally resistant to straightforward code portability and interoperability. This chasm poses a difficult challenge for developers who must either interoperate or port applications that are heavily dependent on IPC devices for their full processing potential.
The challenge has become an opportunity for tools developers to "soften the blow" of developing for NT, while continuing to support numerous other operating systems (e.g. Unix, OS/2, VMS, NeXT). Companies have emerged that provide tools for porting GUIs, databases and a variety of shell development tools.
Using standardized IPC tools, semantics are consistent, source-code investment is preserved and functionality is unchanged between platforms. With regard to functionality, there is an additional aspect that must be considered when porting to NT.
While source-code interoperability problems can be approached using familiar code-layering and aliasing tricks, the problems that arise because of the mismatch of critical IPC functionality can be much more difficult to solve. In some cases the disappearance, or dilution, of certain IPC functionality can require an application to be sub-architected for the less-functional environment.
The IPC services found within Windows NT are quite rich in many aspects. However, they vary from Unix in several ways for message queuing, signal synchronization and shared memory.
From the perspective of a developer moving to Windows NT, there is generally one objective to keep in mind: the less change, the better. This rule applies to all aspects of the port, IPC included. Thus, it is important to measure how much change is imposed on a developer introducing IPC code that is interoperable between Unix and NT.
Messaging
- Windows NT Pipes. The basic model provided by NT for supporting data flow within a multitasking application is that of a data pipe. Pipes in NT are used for connecting two processes/threads that are concurrently active. The pipe is inherently tied to (actually between) these two processes/threads.
Messaging is probably the weakest (and, thus, most vulnerable) link in NT's IPC story. NT pipes are arguably inferior when compared with similar functionality on Unix, and to a lesser extent OS/2 and VMS. That's not to say that the NT pipes are ineffective. Rather, from the comparative perspective of a developer, the IPC area of functionality that is most likely to cause consternation is that of porting the messaging functionality needed by the developer's applications.
The following three points examine the aspects of application messaging that are most likely to cause concern for a developer contemplating application migration to NT.
- Connection orientation. NT pipes are connection-oriented. This means that the processes/threads using a pipe must be active for the life of the pipe. Said differently, the existence of a pipe and its data depends on processes/threads being actively attached to opposite ends of the pipe at all times; otherwise the pipe and its data are lost. Applications built on an IPC mechanism that assumes a certain latency between the processing times of the involved processes/threads cannot readily employ NT pipes without application changes. It is hard to imagine overcoming this limitation without considerable programming effort involving files or similar mechanisms.
- Point-to-point (one-to-one) connectivity. Perhaps the most difficult hurdle to overcome in performing certain ports is that the NT pipes mechanism restricts the number of producer and consumer processes/threads on each end of a pipe to one. NT pipes are point-to-point mechanisms. There is no way to use NT pipes to support a common communication channel between 'm' producers and 'n' consumers (i.e. m:n connectivity). This design is clearly drawn from the client-server model of the BSD socket library. The problem with this approach is that it becomes very difficult to implement scalable, load-balancing application architectures. The ability to dynamically start multiple server copies to relieve messaging traffic pressure from clients is precluded, because there is no common path between clients and servers. Thus, scaling to traffic-flow requirements is not readily possible. This is a serious problem for applications that depend on such scalability; transaction processing applications, in particular, almost universally need this functionality. Here again there is no easy work-around for moving from an IPC environment supporting m:n scalability to Windows NT, and this limitation can force significant redesign.
- FIFO. NT pipes are not message queues. NT pipes are unimodal in the sense that bytes/messages move in and out of the pipe only in FIFO order. There is no alternative mode, no message priorities, no out-of-band messaging, no message filtering and no message selection. Applications that depend on any of the capabilities that derive from message priorities (message filtering, message selection, etc.) must be modified to address this lost functionality.
- Unix message queues. Unlike NT, Unix System V provides message queues for supporting the movement of data between processes/threads. Unix queues are not connection-oriented, are not point-to-point restricted and are not FIFO devices. While being far from perfect, the message queuing model offered by Unix is closely aligned with the needs of a large segment of developers building complex multitasking applications. A problem with Unix message queues is that they are not supportive of asynchronous functionality, making it difficult to employ them within event-driven applications. While this limitation can be circumvented using Unix domain sockets, this approach brings us full circle in that, like NT pipes, socket mechanisms are connection-oriented, are point-to-point restricted and are FIFO devices. Developers who have built their applications on the Unix message queuing mechanisms are, in many cases, likely to find the mapping from Unix queues to NT pipes a serious porting obstacle.
- Standardized IPC message queues. From the vantage point of a developer, standardized IPC message queues are an ideal virtual target platform. That is because standardized IPC message queuing functionality encompasses a superset of the message queuing functionality found in the Unix and NT operating systems. This being the case, such development requires little if any architectural redesign. Like Unix System V queues, standardized IPC provides message queues that are not connection-oriented, are not point-to-point restricted and are not exclusively FIFO devices. There are many other features of standardized IPC queuing that significantly distinguish it (e.g. individual queue sizing, queue triggers, queuing priorities, queue overflow spooling, etc.). It is worth noting that standardized IPC queuing, when used on Windows NT, can be readily integrated within NT's asynchronous environment -- an inherent requirement of any Windows NT programming tool. Thus, standardized IPC queuing incorporates the major features of both operating systems, normalizes them and allows a higher-level abstraction. Also, these queuing abstractions apply for VMS, OS/2, OSF/1, OS400 and Tandem, allowing significant interoperability and porting benefits.
Synchronization
A second form of IPC functionality that is important to developers of complex applications is the set of tools used for synchronization between cooperating processes and threads. Unlike messaging, the topic of interprocess synchronization between processes has not benefited from recent industry hype regarding client-server and object-oriented technology. And, in fact, the number of applications built on messaging far exceeds those that depend exclusively on synchronization primitives.
It is somewhat ironic, then, that it is in this area that Windows NT IPC services are strongest. Below is a brief outline of the nature of NT's synchronization tools, followed by a comparative survey of synchronization tools available in Unix.
- Windows NT synchronization. Windows NT's set of synchronization tools offers extensive functionality. It is arguable, in this regard, that Windows NT is stronger than the other operating systems being discussed. The list of synchronization mechanisms includes mutexes, events and semaphores. (Other tools provided are critical-section objects and interlocked access variables, but they are, in fact, derivatives of the three we examine.) The strength of the NT synchronization tools lies not only in their variety, but also in their ability to be integrated within general-purpose blocking functions that can wait for a wide range of application events to occur.

Mutexes are used for enforcing mutually exclusive access to resources of contention within an application. A mutex can be "owned" by only one thread at a time, which guarantees serialized access to the shared resource. The most typical use of mutexes is enforcing serialized access to memory being shared by multiple processes/threads.

Events are simply two-state flags. The strength of event objects lies in their ability to serve as leverage points for notifying many processes/threads of the occurrence of an event. This form of event multicasting is very powerful for event-driven applications.

NT semaphores are generalized forms of mutexes. That is, while mutexes are used for controlling serialized and exclusive access to a resource (i.e. the limit of simultaneous access is one), semaphores allow a number of processes/threads to simultaneously access a resource, where the limit can be programmer-defined (i.e. the limit of simultaneous access is 'n'). Viewed from a more technical perspective, NT semaphores are integer-like variables that maintain a count between zero and some maximum value, where the maximum value defines the degree of simultaneous access to be allowed. This class of semaphore is often referred to in the computing industry as counting semaphores.
- Unix synchronization. A developer intent on moving an application that is heavily dependent on process/thread synchronization from Unix to NT is not likely to hit any major barriers in developing the requisite mapping between the two sets of synchronization primitives. The set of services provided by Unix is a subset of the NT set. Unix supports only one form of synchronization primitive -- the Unix semaphore. In fact, the Unix semaphore is functionally identical to the NT semaphore, except that Unix semaphores cannot be easily integrated within Unix's general-purpose blocking functions (i.e. the select and poll system calls), and similarly cannot be easily employed for asynchronous operations.
- Standardized IPC synchronization. Standardized IPC semaphores support resource and event semaphores -- much along the lines of the platforms mentioned above. Mapping from native synchronization facilities to standardized IPC is, accordingly, a fairly straightforward exercise. The main advantage of standardized IPC semaphores is their uniformity over not only NT and Unix, but also for OS/2, VMS, OSF/1, OS400 and Tandem.
Memory-sharing
Shared memory is the granddaddy of all IPC mechanisms, and the basic functionality supported on popular multitasking systems has not changed greatly over the years.
- Windows NT memory-sharing. Developers doing multitasking applications who employ shared-memory services on one of the popular multitasking operating systems will find familiar memory-sharing services when porting to Windows NT. NT memory-sharing between processes/threads is supported through the use of memory mapped files. Processes/threads wishing to share memory map a common file into their memory space via the system's file mapping facilities, and the memory is then effectively shared.
- Unix memory-sharing. Unix shared memory is similar in concept to the NT service, so porting application logic involving shared memory requires a trivial amount of work. Like NT, Unix shared-memory functionality can be achieved via system shared memory or memory-mapped files.
- Standardized IPC memory-sharing. Standardized IPC shared memory provides traditional shared-memory capabilities, as well as a certain measure of built-in synchronization functionality. Thus, it is possible to translate all or most kinds of memory-sharing logic used in a developer's application to or from the standardized IPC memory-sharing model. The standardized IPC shared-memory facility is unique among the systems mentioned in that it provides dynamic byte-level locking and protection. This capability is most useful for applications that need to manage tables of application data in high-performance settings.
Standardized IPC segments can be used for building such application memory-resident databases without the need for a full DBMS. The advantages of such an approach are that it avoids the large footprint of a DBMS within an otherwise skinny application, and, in most cases, the standardized IPC solution will perform faster, because traditional DBMS data-dictionary logic is not necessary.
When developing applications that interoperate between Solaris and NT (and possibly other operating systems such as OS/2, NextStep, ESIX, MIPS-OS, UnixWare, OS400 and Tandem), the standardized IPC abstracts away the differences in message queuing, shared memory and signaling. The best of the combined systems is maintained, versus often being compromised with other approaches.
Additionally, one set of application programming interfaces results in less low-level programming, one code base, location transparency, cross-platform portability and interoperability performance enhancements.
Daniel Schwartz is a founder and vice president of Momentum Software Corp. Before founding Momentum, he served four years in the networking systems group at Bell Labs. Larry Duckworth is president and CEO of Momentum Software Corp. From 1985 to 1993, he was president of Intercomputer Communications Corp., which was acquired by DCA. He is also president of the International Message Oriented Middleware Association.
To obtain a white paper that more completely details the interoperability and portability challenges in today's heterogeneous computing environment, contact Momentum at 401 South Van Brunt St., Englewood, N.J., 07631, 800-767-1462.